Included in this dataset are:
- [df_] Anonymized results of the tests used to classify tweets by topic or by sentiment toward public health measures. The tests were: 5 annotators (m1 to m5), a keyword-based method for topic or a neural sentiment analysis model for sentiment (algorithm), and a large language model (gpt). The sampling period (1: December 2020 to February 2021; 2: June to August 2021) is also included. For the neural sentiment analysis models (supporting or against), results at multiple thresholds (0.2 to 0.6) are included.
- [ct_] Cross-tabulated patterns of the test results used to classify tweets by topic or by sentiment toward public health measures. The tests were: 5 annotators, a keyword-based method for topic or a neural sentiment analysis model for sentiment, and a large language model. Frequencies are given for period 1 (P1: December 2020 to February 2021), period 2 (P2: June to August 2021), and all tweets (P1+P2).
- [TEMPmodel] Models used for Bayesian latent class analysis: 
-- TEMPmodel4_2pop.bug: final model including 2 populations and pairwise covariance (conditional dependence between the tests), with a multinomial distribution imposed
-- TEMPmodel2_2pop.bug: model for sensitivity analyses including 2 populations and assuming conditional independence
-- TEMPmodel4.bug: model for sensitivity analyses using all data as 1 population and including pairwise covariance (conditional dependence between the tests), with a multinomial distribution imposed
-- TEMPmodel1_2pop.bug: model for sensitivity analyses including 2 populations with uniform beta(1,1) priors for sensitivity and specificity
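
To illustrate how the [df_] and [ct_] files relate, the sketch below counts how often each pattern of test results occurs, per period and overall. It is only an illustration under assumptions: the test names (m1 to m5, algorithm, gpt) come from the description above, but the "period" key, the 0/1 coding of results, and the toy rows are hypothetical and may not match the actual files. File loading is not shown because the full file names are not listed here.

```python
from collections import Counter

# Test names as described for the [df_] files; "period" is an assumed key name.
TESTS = ["m1", "m2", "m3", "m4", "m5", "algorithm", "gpt"]


def cross_tabulate(rows, tests=TESTS, period_key="period"):
    """Count each pattern of test results for P1, P2, and P1+P2."""
    counts = {}
    for row in rows:
        # One pattern is the tuple of all test results for a single tweet.
        pattern = tuple(row[t] for t in tests)
        # Each tweet contributes to its own period and to the pooled total.
        for label in (f"P{row[period_key]}", "P1+P2"):
            counts.setdefault(label, Counter())[pattern] += 1
    return counts


# Toy rows invented for illustration only (not taken from the dataset).
# Results are assumed to be coded 0 (negative) or 1 (positive).
rows = [
    {"m1": 1, "m2": 1, "m3": 1, "m4": 1, "m5": 1, "algorithm": 1, "gpt": 1, "period": 1},
    {"m1": 1, "m2": 1, "m3": 1, "m4": 1, "m5": 1, "algorithm": 1, "gpt": 1, "period": 2},
    {"m1": 0, "m2": 0, "m3": 1, "m4": 0, "m5": 0, "algorithm": 0, "gpt": 1, "period": 2},
]

counts = cross_tabulate(rows)
for label in sorted(counts):
    for pattern, n in sorted(counts[label].items()):
        print(label, pattern, n)
```

Frequencies of this kind, one count per result pattern, are the usual input for a latent class model with a multinomial likelihood, but the exact layout expected by the .bug files should be checked against the [ct_] files themselves.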


